CROSS-REFERENCE TO RELATED APPLICATION

The present application is related to concurrently filed U.S. patent
application No.: xx/xxx,xxx (Attorney Docket No. M-8832 US), entitled "Look-Up
Table Addressing Scheme" of Yuan et al.; concurrently filed U.S. patent
application No.: xx/xxx,xxx (Attorney Docket No. M-8833 US), entitled "Look-up
Table Index Value Generation in a Turbo Decoder" of Zhang et al.; and
concurrently filed U.S. patent application No.: xx/xxx,xxx (Attorney Docket No.
M-8928 US), entitled "A Stop Iteration Criterion for Turbo Decoding" of Yuan et
al. The applications referenced herein and the present application are commonly
assigned and have at least one common inventor.

BACKGROUND OF THE INVENTION 
1. Field of the Invention

The invention generally relates to the field of error correction codes for
communication systems, and in particular, the present invention relates to
implementations of turbo decoding methods and systems.

2. Background of the Invention

In digital communication systems, information (such as data or voice) is
transmitted through a channel which is often noisy. The noisy channel introduces
errors in the information being transmitted such that the information received at the
receiver is different from the information transmitted. To reduce the probability
that noise in the channel could corrupt the transmitted information, communication
systems typically employ some sort of error correction scheme. For instance,
wireless data communication systems, operated in a low signal-to-noise ratio
(SNR) environment, typically employ forward error correction (FEC) schemes.
When FEC coding is used, the transmitted message is encoded with sufficient
redundancies to enable the receiver to correct some of the errors introduced in the
received message by noise in the communication channel.

Various FEC coding schemes are known in the art. In particular, turbo
codes are a type of FEC code capable of achieving better error
performance than the conventional FEC codes. In fact, it has been reported that
turbo codes could come within 0.7 dB of the theoretical Shannon limit for a bit
error rate (BER) of $10^{-5}$. Because turbo codes can achieve exceptionally low error
rates in a low signal-to-noise ratio environment, turbo codes are particularly
desirable for use in wireless communications where the communication channels
are especially noisy as compared to wired communications. In fact, the recent
CDMA wireless communications standard includes turbo codes as one of the
possible encoding schemes. For a detailed description of turbo coding and decoding
schemes, see "Near Shannon limit error-correcting coding and decoding:
Turbo-codes (1)," Berrou et al., Proc. IEEE Int'l Conf. on Communications, Geneva,
Switzerland, pp. 1064-1070, 1993, and "Iterative decoding of binary block and
convolutional codes," Hagenauer et al., IEEE Trans. Inform. Theory, pp. 429-445,
March 1996, which are incorporated herein by reference in their entireties. In
brief, turbo codes are the parallel concatenation of two or more recursive
systematic convolutional codes, separated by pseudorandom interleavers.
Decoding of turbo codes involves an iterative decoding algorithm.

While turbo codes have the advantage of providing high coding gains,
decoding of turbo codes is often complex and involves a large number of
computations. Turbo decoding is typically based on the maximum a posteriori
(MAP) algorithm, which operates by calculating the maximum a posteriori
probabilities for the encoded data. While it has been recognized that the MAP
algorithm is the optimal decoding algorithm for turbo codes, it is also recognized
that implementation of the MAP decoding algorithm is very difficult in practice
because of its computational complexity. To ease the computational burden of
the MAP algorithm, approximations and modifications to the MAP algorithm have
been developed. These include the Max-Log-MAP algorithm and the Log-MAP
algorithm. The MAP, Max-Log-MAP and Log-MAP algorithms are described in
detail in "A Comparison of Optimal and Sub-Optimal MAP Decoding Algorithms
Operating in the Log Domain," Robertson et al., IEEE Int'l Conf. on
Communications (Seattle, WA), June, 1995, which is incorporated herein by
reference in its entirety.

The MAP algorithm provides the logarithm of the ratio of the a posteriori
probability (APP) of each information bit being "1" to the APP of the data bit
being "0." The probability value is given by equation (1) of Robertson et al. The
computation of the APP requires computing the forward recursion ($\alpha_k(\cdot)$), the
backward recursion ($\beta_k(\cdot)$), and the branch transition probabilities (denoted
$\gamma_i(\cdot)$ in Robertson et al.). To reduce the computational complexity of the MAP
algorithm, the Max-Log-MAP and Log-MAP algorithms perform the entire decoding
operation in the logarithmic domain. In the log domain, multiplication operations
become addition operations, thus simplifying numeric computations involving
multiplication. However, the addition operations in the non-log domain become
more complex in the log domain. For example, the summation of two metrics
$e^{x_1}$ and $e^{x_2}$ is straightforward in the non-log domain and is accomplished by
adding the two metrics, $e^{x_1} + e^{x_2}$. But in the Log-MAP algorithm, the metrics
being calculated are $x_1$ and $x_2$. In order to add the two metrics in the non-log
domain, the metrics $x_1$ and $x_2$ must first be converted to the non-log domain by
taking the exponential, then adding the exponentiated metrics, and finally taking the
logarithm to revert back to the log domain. Thus, the sum of metrics $x_1$ and $x_2$ is
computed as $\log(e^{x_1} + e^{x_2})$. Equivalently, the computation can be rewritten as:

$\log(e^{x_1} + e^{x_2}) = \max(x_1, x_2) + \log(1 + e^{-|x_1 - x_2|})$,   (i)

which can be simplified by approximating the function $\log(1 + e^{-|x_1 - x_2|})$ by a
look-up table. Thus the approximation for the sum of the two metrics is:

$\log(e^{x_1} + e^{x_2}) \approx \max(x_1, x_2) + \log_{table}(|x_1 - x_2|)$,   (ii)

30 
where $\log_{table}(|x_1 - x_2|)$ is an N-entry look-up table. It has been shown that as
few as 8 entries are sufficient to achieve negligible bit error or frame error
degradation. The look-up table, $\log_{table}(|x_1 - x_2|)$, is one-dimensional because the
correction values depend only on the argument $|x_1 - x_2|$. Figure 1 illustrates an
exemplary N-entry look-up table used in the computation of equation (ii) above in
the Log-MAP decoding algorithm. In Figure 1, look-up table 100 includes two
data fields. Data field 102 includes N entries of the table indexes z, denoted as $z_0$,
$z_1$, ..., and $z_{N-1}$, where z is defined as $|x_1 - x_2|$. Data field 104 includes the
corresponding N entries of the computed table values of $\log_{table}(z)$, denoted as $a_0$,
$a_1$, ..., and $a_{N-1}$, which are the computed values of the equation $\log(1 + e^{-z})$. To
address look-up table 100 for a given value of z, the value z is compared to the
defined ranges of the table indexes $z_0$, $z_1$, ..., and $z_{N-1}$ to determine in which
threshold range z belongs. The defined ranges of table thresholds are as follows:

    $z < z_1$:              $\log_{table}(z) = a_0$
    $z_1 \le z < z_2$:      $\log_{table}(z) = a_1$
    $z_2 \le z < z_3$:      $\log_{table}(z) = a_2$
    ...
    $z \ge z_{N-1}$:        $\log_{table}(z) = a_{N-1}$

When the correct table threshold range is identified, for example, when z is within
the range of $z_1$ and $z_2$ (data cell 103), the value $a_1$ (data cell 105) will be returned
by look-up table 100.
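
The identity of equation (i) and the table approximation of equation (ii) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the table range `z_max`, the regular spacing of thresholds, and the choice of sampling each entry at the midpoint of its range are all assumptions made for the example.

```python
import math

def max_star_exact(x1, x2):
    """Exact log(e^x1 + e^x2), computed via equation (i)."""
    return max(x1, x2) + math.log1p(math.exp(-abs(x1 - x2)))

def build_log_table(n_entries=8, z_max=4.0):
    """Return (thresholds, values) for an N-entry correction table.

    thresholds[i] is the lower edge z_i of entry i (z_0 = 0).
    values[i] is log(1 + e^-z) sampled at the midpoint of the entry's
    range; midpoint sampling is an assumption of this sketch.
    """
    step = z_max / n_entries
    thresholds = [i * step for i in range(n_entries)]
    values = [math.log1p(math.exp(-(t + step / 2))) for t in thresholds]
    return thresholds, values

def table_lookup(z, thresholds, values):
    """Return a_i for z_i <= z < z_{i+1}; the last entry covers z >= z_{N-1}."""
    index = 0
    for i in range(1, len(thresholds)):
        if z >= thresholds[i]:
            index = i
    return values[index]

def max_star_table(x1, x2, thresholds, values):
    """Approximate log(e^x1 + e^x2) via equation (ii)."""
    return max(x1, x2) + table_lookup(abs(x1 - x2), thresholds, values)
```

With the default 8 entries over the range 0 to 4, the table result stays close to the exact value; the largest deviation occurs near z = 0, where the correction term changes fastest.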

However, improvements to turbo decoding using the Log-MAP algorithm
are desired to further reduce the complexity of the turbo decoder and to reduce the
decoder processing time.

SUMMARY OF THE INVENTION 

A method is provided for computing the function $\log(e^{x_1} + e^{x_2})$ or
$\ln(e^{x_1} + e^{x_2})$ for a first argument value $x_1$ and a second argument value $x_2$. The
method includes generating a table having a first data field and a second data field.
The first data field includes N entries of table index values selected from a range of
$|x_1 - x_2|$ argument values and scaled by a scaling factor. The second data field
includes N entries of computed table values computed based on the equation
$\log(1 + e^{-|x_1 - x_2|})$ or $\ln(1 + e^{-|x_1 - x_2|})$ for each of the $|x_1 - x_2|$ argument values
selected for the table index values. The computed table values are also scaled by the
same scaling factor.

According to another aspect of the present invention, the function
$\log(e^{x_1} + e^{x_2})$ or $\ln(e^{x_1} + e^{x_2})$ is computed by first computing an index value
$z = |x_1 - x_2|$. Then, the index value z is compared with the table index values in the
first data field of the table to determine the table index value range to which the index
value z belongs. The table then returns a first computed table value from the
computed table values in the second data field corresponding to the table index
value range to which the index value z belongs. The first computed table value is
added to the greater of the first argument value $x_1$ and the second argument value $x_2$.

According to another embodiment of the present invention, a decoder for
decoding input data is provided. The decoder implements the maximum a
posteriori probability decoding algorithm using a scaled look-up table for
computing the function $\log(e^{x_1} + e^{x_2})$ or $\ln(e^{x_1} + e^{x_2})$, where $x_1$ and $x_2$ are first and
second argument values, each derived from the input data which have not been
scaled. The table stores a first data field including N entries of table index values
and a second data field including N entries of computed table values corresponding
to the N entries of table index values. The N entries of table index values are
selected from a predefined range of $|x_1 - x_2|$ argument values and scaled by a first
scaling factor. The N entries of computed table values are computed based on the
equation $\log(1 + e^{-|x_1 - x_2|})$ or $\ln(1 + e^{-|x_1 - x_2|})$ for each of the $|x_1 - x_2|$ argument values
selected for the table index values, and scaled by the first scaling factor.

According to the present invention, a method and an apparatus are provided to
enable a turbo decoder to perform both the scaling operation and the decoding
operation with greater efficiency.

The present invention is better understood upon consideration of the
detailed description below and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates an exemplary N-entry look-up table used in the Log- 
MAP decoding algorithm. 

Figure 2 is a block diagram of a turbo decoder according to one 
embodiment of the present invention. 

Figure 3 is a block diagram illustrating a complete iteration of the decoding 
operation of turbo decoder 200. 

Figure 4 is a scaled look-up table according to one embodiment of the 
present invention. 

Figure 5 is a block diagram of a receiver incorporating a quantizer and a
turbo decoder according to one embodiment of the present invention.

Figure 6 illustrates a 4-bit uniform quantizer.

Figure 7 is a scaled look-up table according to another embodiment of the 
present invention. 

Figure 8 is a wireless receiver incorporating a turbo decoder according to 
one embodiment of the present invention. 

Figure 9 is a 2N-entry look-up table with modified table threshold 
conditions for use in the Log-MAP decoding algorithm according to one 
embodiment of the present invention. 

Figure 10 is a block diagram illustrating one exemplary implementation of
an index value generation circuit for computing the index value $z = |x_1 - x_2|$.

Figure 11 is a block diagram illustrating an index value generation circuit
for computing the index value $z = |x_1 - x_2|$ according to one embodiment of the
present invention.

Figure 12 is a block diagram of a turbo decoder incorporating the stop 
iteration criterion of the present invention in its decoding operation according to 
one embodiment of the present invention. 
Figure 13 is a block diagram illustrating a complete iteration of the
decoding operation of the turbo decoder of Figure 12.

In the present disclosure, like objects which appear in more than one figure 
are provided with like reference numerals. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An Implementation of a Turbo Decoder

In a digital communication system employing turbo codes, information bits
to be transmitted over a communication channel are encoded as an information
sequence (also called systematic information) and two or more parity sequences
(also called parity information). The information sequence and the parity
sequences are multiplexed to form the code word. A turbo encoder includes two
or more constituent encoders for generating the parity sequences. Typically, the
constituent encoder of the turbo encoder is a recursive systematic convolutional
encoder. Turbo encoding is described in detail in the aforementioned article by
Berrou et al., "Near Shannon limit error-correcting coding and decoding: turbo
codes." The output of the turbo encoder can be punctured in order to increase the
code rate. When puncturing is used, a predetermined pattern of bits is removed
from the code word. After encoding, the code word is modulated according to
techniques known in the art and transmitted over a noisy communication channel,
either wired or wireless. In the present embodiment, an AWGN (additive white
Gaussian noise) communication channel with one-sided noise power spectral
density $N_0$ is assumed.
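
As an illustration of puncturing, the sketch below multiplexes one systematic and two parity sequences, drops the bits selected by a puncturing pattern, and shows the inverse operation at the receiver. The function names, the pattern representation (one keep/drop tuple per bit position, repeated cyclically), and the use of a neutral erasure value for removed positions are assumptions of this example; the source does not specify the format of the puncturing table.

```python
def puncture(systematic, parity1, parity2, pattern):
    """Multiplex three streams and drop bits where the pattern holds 0.

    pattern is a list of (keep_sys, keep_p1, keep_p2) tuples that is
    applied cyclically over the bit positions (hypothetical format).
    """
    out = []
    for k, bits in enumerate(zip(systematic, parity1, parity2)):
        keep = pattern[k % len(pattern)]
        out.extend(b for b, flag in zip(bits, keep) if flag)
    return out

def depuncture(received, pattern, frame_len, erasure=0.0):
    """Rebuild the three streams, inserting a neutral value where bits were removed."""
    streams = ([], [], [])
    it = iter(received)
    for k in range(frame_len):
        keep = pattern[k % len(pattern)]
        for stream, flag in zip(streams, keep):
            stream.append(next(it) if flag else erasure)
    return streams
```

For example, the pattern `[(1, 1, 0), (1, 0, 1)]` keeps every systematic bit and alternates between the two parity bits, which raises a rate-1/3 code word to rate 1/2.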

When the transmitted code word is received by a receiver, the received data
stream is demodulated, filtered, and sampled in accordance with techniques known
in the art. The received data stream is then separated into a received information
data stream and a received parity data stream and both are provided to a turbo
decoder for decoding. Figure 2 is a block diagram of a turbo decoder according to
one embodiment of the present invention. Turbo decoder 200 includes a frame
buffer 202 for storing input data, including both the systematic and parity
information, received on the communication channel. In Figure 2, input data on
bus 212 has already been demodulated, filtered, and sampled by a signal processor
(not shown) according to techniques known in the art. In one embodiment, the
input data is stored in a 4-bit two's complement format. Additional information
necessary for the decoding operation is also provided to turbo decoder 200. The
additional information can include, but is not limited to, the frame size of the input
data (bus 214), the puncturing table (bus 216), the signal-to-noise ratio $E_s/N_0$ (bus
218), and the first quantizer level Qx[0] (bus 220). The quantizer level information
is optional and is needed only when the input data is quantized for fixed point
processing. The puncturing table information is also optional and is needed only
when puncturing is used in the turbo encoding process.

Turbo decoder 200 further includes a decoder 204, an interleaver 206 and a
deinterleaver 208. Decoder 204 is an elementary decoder implementing the
Log-MAP decoding algorithm for computing the a posteriori probabilities (APP) of the
individual information bits. Decoder 204 performs metric computations as detailed
in Robertson et al., which include three major computational components: the
forward probability $\alpha_k$, the backward probability $\beta_k$, and the extrinsic information
$P_k$. Decoder 204 produces soft information, denoted $P_k$, for the received systematic
information on output bus 222. Output bus 222 is coupled to a switch 210
which alternately connects soft information $P_k$ to either the input port 226 of
interleaver 206 or the input port 224 of deinterleaver 208. Switch 210 is provided
to enable the use of one decoder (decoder 204) in the two-stage iterative decoding
process of turbo decoding. The operation of turbo decoder 200 will be
explained in more detail below in reference to Figure 3. The decoding process
continues for a predefined number of iterations and final bit decisions are made at
the end of the last iteration. Decoder 204 then provides bit decisions on output bus
228 which represent the information bits decoded by the turbo decoder 200.

Figure 3 is a block diagram illustrating a complete iteration of the decoding
operation of turbo decoder 200. After the received data is demodulated, the
received input data (bus 212) is provided to frame buffer 202 for storage. When
appropriate, the input data is first depunctured by depuncture block 324 and
depuncture with interleaver block 325 using the puncturing table information
supplied to the decoder on bus 216 (Figure 2). Each iteration of the turbo decoding
process consists of two stages of decoding. The first stage of decoding, performed
by decoder 332, operates on the systematic information R0 (bus 326), encoded
information R00 and R01 (buses 327 and 328) representing the encoded bits
generated by a first encoder in the turbo encoder which encoded the message, and a
posteriori information P2 (bus 323) computed in the second stage and
deinterleaved by deinterleaver 208. The second stage of decoding, performed by
decoder 334, operates on the interleaved systematic information R1 (bus 329),
encoded information R10 and R11 (buses 330 and 331) representing the encoded
bits generated by a second encoder in the turbo encoder, and a posteriori
information P1 (bus 322) computed in the first decoding stage and interleaved by
interleaver 206. Data sequences R1, R10, and R11 from frame buffer 202 are
depunctured by depuncture with interleaver block 325 before being provided to
decoder 334.

In operation, the a posteriori information P1 (also called the extrinsic
information), computed in the first stage by decoder 332, is stored in a buffer in
interleaver 206. In the present embodiment, a posteriori information P1 is stored in
8-bit two's complement format. Similarly, the a posteriori information P2,
computed in the second stage by decoder 334, is stored in a buffer in deinterleaver
208. In the present embodiment, P2 is also stored in 8-bit two's complement
format. After a predefined number of iterations of the decoding process, the
resulting bit decisions are provided on bus 228.

Because the two stages of decoding are identical except for the input data,
decoder 332 and decoder 334 are identical elementary decoders. In fact, because
the two decoders operate on different sets of input data at different times, only one
decoder block is needed in an actual implementation of the turbo decoder, as shown in
Figure 2. In turbo decoder 200 of Figure 2, switch 210 couples output bus 222
to input port 226 of interleaver 206 during the first decoding stage. Upon
completion of the first decoding stage, decoder 204 (functioning as decoder 332)
stores extrinsic information P1 in the buffer in interleaver 206 and switch 210
switches from input port 226 to input port 224, thereby connecting output bus 222
to deinterleaver 208. The second decoding stage proceeds with extrinsic
information P1 stored in interleaver 206, and decoder 204 (functioning as decoder
334) generates extrinsic information P2, which is then stored in the buffer in
deinterleaver 208 for use by decoder 204 in the next iteration of the decoding
process. At the completion of the second decoding stage, switch 210 connects
output bus 222 back to input port 226 of interleaver 206 for the next iteration of
decoding. Therefore, only one decoder is actually needed to implement the
two-stage decoding process in turbo decoder 200.
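
The single-decoder schedule described above can be sketched as a loop in which one elementary decoder is called twice per iteration, with its output routed first through the interleaver and then through the deinterleaver. This is a control-flow sketch only: `siso` is a hypothetical stand-in for decoder 204, the permutation `perm` and all function names are assumptions, and taking the final bit decisions from the sign of P2 is a simplification made for the example.

```python
def interleave(seq, perm):
    """Reorder seq so that output position i holds seq[perm[i]]."""
    return [seq[p] for p in perm]

def deinterleave(seq, perm):
    """Undo interleave(): put element i back at position perm[i]."""
    out = [None] * len(seq)
    for i, p in enumerate(perm):
        out[p] = seq[i]
    return out

def turbo_decode(siso, r0, r00, r01, r10, r11, perm, n_iter):
    """Run n_iter two-stage iterations with a single elementary decoder.

    siso(systematic, parity_a, parity_b, prior) -> list of soft values
    is a hypothetical interface standing in for decoder 204.
    """
    r1 = interleave(r0, perm)      # interleaved systematic information R1
    p2 = [0.0] * len(r0)           # deinterleaver buffer before the first iteration
    for _ in range(n_iter):
        # Stage 1: switch routes the decoder output to the interleaver.
        p1 = siso(r0, r00, r01, p2)
        p1_interleaved = interleave(p1, perm)
        # Stage 2: switch routes the decoder output to the deinterleaver.
        p2_interleaved = siso(r1, r10, r11, p1_interleaved)
        p2 = deinterleave(p2_interleaved, perm)
    # Positive log-ratio means bit "1" is more likely than bit "0".
    return [1 if v > 0 else 0 for v in p2]
```

Because both stages call the same `siso` function with different inputs, the loop mirrors the role of switch 210: only the routing of the output changes between stages.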

In Figures 2 and 3, the turbo decoder performs a two-stage iterative
decoding process, either using one decoder for both stages or using two constituent
decoders as shown in Figure 3. Of course, the turbo decoder of the present
invention can include two or more decoding stages or constituent decoders. The
number of decoding stages or constituent decoders is a function of the number of
constituent encoders in the turbo encoder used to encode the input data. For a
turbo encoder consisting of N constituent encoders, the turbo decoder will have the
corresponding N constituent decoders or decoding stages in each
iteration of the decoding process.

As part of the decoding process, the received data is typically scaled
appropriately by various parameters before metric calculations are carried out by
the decoder. In one case, the scaling includes weighting the received data by the
inverse of the noise variance of the communication channel. The weighting is
necessary because of the Gaussian noise assumption. The noise variance $\sigma^2$ is
derived from the signal-to-noise ratio information, $E_s/N_0$, provided to turbo
decoder 200 on bus 218. The signal-to-noise ratio $E_s/N_0$ of a communication
channel is determined according to known estimation techniques. In another case,
when quantization for fixed point processing is used, the log table entry is scaled
by the first quantizer level Qx[0] (bus 220 of Figure 2) so that the input data and
the table values are in the same units. Furthermore, when the decoding operation is to
be performed using fixed point processing, the received data may need to be scaled
accordingly to ensure that the appropriate dynamic range and precision are
achieved in the metric computation. In conventional turbo decoders, the scaling of
the received data is carried out by scaling the entire frame of the received data
stored in the frame buffer. Because the received data includes a large number of
data bits, the scaling operation requires a large number of computations and
introduces undesired latency into the decoding process.

In accordance with the principles of the present invention, a method and an
apparatus are provided to enable turbo decoder 200 to perform both the scaling
operation and the decoding operation with greater efficiency. In turbo decoder 200
of the present invention, decoder 204 uses a scaled look-up table for computing the
function $\log(e^{x_1} + e^{x_2})$ in the Log-MAP decoding algorithm. The scaled look-up
table of decoder 204 incorporates the scaling operation of the received data in the
table entries. Because only entries in the scaled look-up table need to be scaled by
the desired scale factors, as opposed to the entire frame of the received data, and
because the scaled look-up table has only a few entries, a significant reduction in
the number of computations is realized. The use of the scaled look-up table in
turbo decoder 200 of the present invention provides for faster decoding operation
and a less complex decoder implementation.

As discussed above, the Log-MAP decoding algorithm requires computing
the function $\log(e^{x_1} + e^{x_2})$ or $\ln(e^{x_1} + e^{x_2})$ for a series of argument values $x_1$ and
$x_2$. In turbo decoder 200, the argument values $x_1$ and $x_2$ are derived from the input
data stored in frame buffer 202 which have not been scaled. As described above,
the function $\log(e^{x_1} + e^{x_2})$ can be approximated by:

$\log(e^{x_1} + e^{x_2}) = \max(x_1, x_2) + \log(1 + e^{-|x_1 - x_2|})$,   (iii)

$\approx \max(x_1, x_2) + \log_{s\text{-}table}(|x_1 - x_2|)$,   (iv)

where the calculation of the second term of equation (iii) is accomplished by the
use of the scaled look-up table, $\log_{s\text{-}table}(|x_1 - x_2|)$. The values stored in the scaled
look-up table serve as a correction function to the first term of equation (iii)
involving the maximization of the argument value $x_1$ and the argument value $x_2$.
An approximate computation of the function $\log(e^{x_1} + e^{x_2})$ in the decoding
operation is realized through the use of equation (iv). The scaled look-up table is
generated at the beginning of each frame of received data. In one embodiment, the
scaled look-up table is stored in a memory location within decoder 204. The
memory location can be implemented as typical memory devices such as a RAM
and registers. In another embodiment, the scaled look-up table is stored as a
logical circuit in decoder 204.

One embodiment of the scaled look-up table of the present invention will
now be described with reference to Figure 4. Scaled look-up table 400 is an
N-entry precomputed look-up table and includes two data fields. Data field 402
includes N entries of the table indexes (or table threshold values) $\tilde{z}$, denoted as
$\tilde{z}_0$, $\tilde{z}_1$, ..., and $\tilde{z}_{N-1}$. In table 400, table index $\tilde{z}$ is scaled by the noise variance
$\sigma^2$ according to the following equation:

$\tilde{z} = z\sigma^2$, where $z = |x_1 - x_2|$.   (v)

As mentioned above, argument values $x_1$ and $x_2$ are derived from the input data
which have not been scaled. The table indexes $\tilde{z}$ are selected from a predefined
range of $|x_1 - x_2|$ argument values. In one embodiment, the table indexes $\tilde{z}$ are
selected at regular intervals within the predefined range.

Table 400 further includes a data field 404 which includes N entries of the
computed table values, $\log_{s\text{-}table}(\tilde{z})$, according to the equation:

$\log_{s\text{-}table}(\tilde{z}) = \log(1 + e^{-z})\,\sigma^2$.   (vi)

Each entry of the computed table values, $\log_{s\text{-}table}(\tilde{z})$, is computed based on
$z = |x_1 - x_2|$ and corresponds to each entry of the table index $\tilde{z}$. In scaled look-up table
400, the computed table values of data field 404 are also scaled by the noise
variance $\sigma^2$, resulting in N entries of computed table values denoted as $\tilde{a}_0$, $\tilde{a}_1$,
$\tilde{a}_2$, ..., and $\tilde{a}_{N-1}$. By incorporating the scaling of the noise variance in scaled
look-up table 400, scaling of the entire frame of received data is circumvented and
turbo decoder 200 can perform decoding computations with greater efficiency.

Scaled look-up table 400 is addressed by first computing an index value z
based on the argument values $x_1$ and $x_2$ according to the equation $z = |x_1 - x_2|$. The
argument values $x_1$ and $x_2$ are derived from the input data in frame buffer 202
which have not been scaled. Then, the index value z is compared with table
indexes $\tilde{z}$ in data field 402 to determine to which table index range the index
value z belongs. The threshold conditions for scaled look-up table 400 are as
follows:

    $z < \tilde{z}_1$:                      $\log_{s\text{-}table}(z) = \tilde{a}_0$
    $\tilde{z}_1 \le z < \tilde{z}_2$:      $\log_{s\text{-}table}(z) = \tilde{a}_1$
    $\tilde{z}_2 \le z < \tilde{z}_3$:      $\log_{s\text{-}table}(z) = \tilde{a}_2$
    ...
    $z \ge \tilde{z}_{N-1}$:                $\log_{s\text{-}table}(z) = \tilde{a}_{N-1}$

For example, if index value z satisfies the threshold condition $\tilde{z}_1 \le z < \tilde{z}_2$, then
index value z belongs to table index cell 406 of data field 402 and scaled look-up
table 400 returns the computed value $\tilde{a}_1$ in cell 407 of data field 404. In this
manner, turbo decoder 200 uses scaled look-up table 400 for metric calculations
involving computing the function $\log(e^{x_1} + e^{x_2})$. In accordance with the present
invention, significant reduction in computations is achieved by providing scaled
look-up table 400 which incorporates the scaling factor for the received input
data, as opposed to scaling the entire frame of input data.
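
The effect of folding the noise variance into the table can be checked numerically. The sketch below compares the conventional approach (scale both inputs by the inverse of the noise variance, then use an unscaled table) with the scaled-table approach (leave the inputs unscaled and multiply the table thresholds and values by the noise variance). The function names, table range, regular threshold spacing, and midpoint sampling are assumptions of this example.

```python
import math

def build_table(n_entries=8, z_max=4.0):
    """Unscaled table: thresholds z_i and values a_i = log(1 + e^-z) at range midpoints."""
    step = z_max / n_entries
    thresholds = [i * step for i in range(n_entries)]
    values = [math.log1p(math.exp(-(t + step / 2))) for t in thresholds]
    return thresholds, values

def scale_table(thresholds, values, sigma2):
    """Scaled table: multiply every threshold and value by the noise variance."""
    return [t * sigma2 for t in thresholds], [v * sigma2 for v in values]

def lookup(z, thresholds, values):
    """Return the value of the last entry whose threshold does not exceed z."""
    index = 0
    for i in range(1, len(thresholds)):
        if z >= thresholds[i]:
            index = i
    return values[index]

def max_star(x1, x2, thresholds, values):
    """max(x1, x2) plus the table correction for |x1 - x2|."""
    return max(x1, x2) + lookup(abs(x1 - x2), thresholds, values)
```

Under these assumptions, the scaled-table result on unscaled inputs equals the noise variance times the conventional result, so the two approaches produce the same metrics up to a common factor. The per-frame cost drops from one multiplication per received sample to 2N multiplications for the table.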

The scaled look-up table according to the present invention can also be
applied when fixed point processing is used in the turbo decoding process. In fixed
point processing, the input data is quantized to a fixed number of levels and each
level is represented by an n-bit quantizer value. Figure 5 is a block diagram of a
receiver incorporating a quantizer and a turbo decoder according to one
embodiment of the present invention. Received data on bus 212 is provided to
quantizer 504 after the data has been demodulated by a demodulator (not shown),
filtered and sampled according to techniques known in the art. Quantizer 504
provides to turbo decoder 500 an n-bit quantizer value for each bit of input data on
bus 505. Quantizer 504 also provides the first quantizer level Qx[0] on bus 520 to
turbo decoder 500. The first quantizer level Qx[0] is provided to enable turbo
decoder 500 to derive the quantizer output level from the n-bit quantizer value.
Turbo decoder 500 is implemented in the same manner as turbo decoder 200 of
Figure 2 and provides bit decisions on output bus 528.

In one embodiment, quantizer 504 is implemented as a 4-bit uniform
quantizer as shown in Figure 6. The quantizer input threshold level, Qt[i], is
shown on the x-axis. For each quantizer input threshold level, the corresponding
quantizer output level, Qx[i], is shown on the y-axis. Each quantizer output level
Qx[i] is given a 4-bit quantizer value representation. For example, in Figure 6, the
4-bit quantizer value for quantizer output level Qx[0] is 0000, for Qx[1] is 0001,
and so on. In the quantizer of Figure 6, the first quantizer level Qx[0] is a half level;
therefore, each of the subsequent quantizer output levels Qx[i] is an odd multiple of
the first quantizer level, given as follows:

$Qx[i] = Qx[0](2y + 1)$,   (vii)

where y is the numeric value of the 4-bit quantizer value received on bus 505. The
same quantization rule applies to negative input values. In addition, the quantizer
value may also need to be scaled appropriately to achieve the desired dynamic
range and precision. When dynamic range adjustment is applied, the quantizer
output used in the metric calculation is:

$(2y + 1)\rho$,   (viii)

where y is the numeric value of the 4-bit quantizer value received on bus 505 and $\rho$
denotes the scale factor for dynamic range adjustment. In this instance, turbo
decoder 500 needs to scale the input data according to equation (viii) above before
metric calculations are carried out. According to another embodiment of the
present invention, turbo decoder 500 uses a scaled look-up table 700 (Figure 7) for
metric calculations involving computing the function $\log(e^{x_1} + e^{x_2})$ in the
Log-MAP decoding algorithm. Scaled look-up table 700 is an N-entry precomputed
scaled look-up table and includes two data fields. Data field 702 includes N entries
of the table indexes $z'$, denoted as $z'_0$, $z'_1$, $z'_2$, ..., and $z'_{N-1}$. In scaled look-up
table 700, table index $z'$ is scaled by the noise variance $\sigma^2$, the first quantizer level
Qx[0], and scaling factor $\rho$ for dynamic range adjustment according to the
following equation:

$z' = z\rho\sigma^2 / Qx[0]$,   (ix)

where $z = |x_1 - x_2|$. The table indexes $z'$ are selected from a predefined range of
$|x_1 - x_2|$ argument values. In one embodiment, the table indexes $z'$ are selected at
regular intervals within the predefined range.
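
The quantizer output rule of equations (vii) and (viii) can be sketched as follows. The mapping from an input sample to the quantizer value y assumes a uniform quantizer with step size 2·Qx[0] and a signed 4-bit (two's complement) range for y; that mapping, the function names, and the representation of the scale factor as a shift count are assumptions of this example, not details taken from Figure 6.

```python
import math

def quantizer_value(x, qx0, n_bits=4):
    """Map a sample x to a signed n-bit quantizer value y.

    Assumes a uniform quantizer with step 2*Qx[0], clipped to the
    two's complement range of n_bits.
    """
    step = 2.0 * qx0
    y = math.floor(x / step)
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return max(lo, min(hi, y))

def output_level(y, qx0):
    """Equation (vii): the output level is an odd multiple of Qx[0]."""
    return qx0 * (2 * y + 1)

def metric_input(y, shift):
    """Equation (viii) with the scale factor equal to 2**shift, applied as a bit shift."""
    return (2 * y + 1) << shift
```

With Qx[0] = 0.25, a sample of 0.3 maps to y = 0 and output level 0.25, while a sample of -0.3 maps to y = -1 and output level -0.25, so the levels are symmetric about zero.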

Table 700 further includes a data field 704 which includes N entries of the
computed table values, $\log_{s\text{-}table}(z')$, according to the equation:

$\log_{s\text{-}table}(z') = \log(1 + e^{-z})\,\rho\sigma^2 / Qx[0]$.   (x)

Each entry of the computed table values, $\log_{s\text{-}table}(z')$, is computed based on
$z = |x_1 - x_2|$ and corresponds to each entry of the table indexes $z'$. In scaled look-up
table 700, the computed table values of data field 704 are scaled by the noise
variance $\sigma^2$, the first quantizer level Qx[0], and scaling factor $\rho$, resulting in N
entries of computed table values denoted as $a'_0$, $a'_1$, ..., and $a'_{N-1}$. In the present
embodiment, the scale factor for dynamic range adjustment $\rho$ is chosen to be a
power of two to simplify the multiplication process. When dynamic range
adjustment is used, the scale factor for dynamic range adjustment $\rho$ is applied to
both the input data and to scaled look-up table 700. In conventional systems, the
computational burden is heavy because the turbo decoder has to scale the input
data by the noise variance and the dynamic range adjustment separately. In
accordance with the present invention, the input data only needs to be scaled by the
dynamic range adjustment scale factor. Furthermore, because the scale factor $\rho$ is
a power of two, the multiplication of the input data reduces to a bit-shifting
operation. Thus, by incorporating the scaling of the noise variance and
the first quantizer level Qx[0] in scaled look-up table 700, scaling of the entire
frame of received data is circumvented. Turbo decoder 500 only has to scale the
received data by scaling factor $\rho$, which is a simple bit-shifting operation. Thus,
turbo decoder 500 can be operated at the same high level of efficiency as turbo
decoder 200.

Scaled look-up table 700 is addressed in the same manner as scaled look-up
table 400. First, an index value $z$ based on the argument values $x_1$ and $x_2$ is
computed according to the equation $z = |x_1 - x_2|$. The argument values $x_1$ and
$x_2$ are derived from the input data in frame buffer 202 which have not been
scaled. Then, the index value $z$ is compared with table indexes $z'$ in data field
702 to determine to which table index range the index value $z$ belongs. The table
threshold conditions for scaled look-up table 700 are given as follows:



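$$\begin{aligned}
z < z'_1, &\qquad \log_{s\text{-}table}(z) = a'_0, \\
z'_1 \le z < z'_2, &\qquad \log_{s\text{-}table}(z) = a'_1, \\
z'_2 \le z < z'_3, &\qquad \log_{s\text{-}table}(z) = a'_2, \\
&\ \ \vdots \\
z \ge z'_{N-1}, &\qquad \log_{s\text{-}table}(z) = a'_{N-1}.
\end{aligned}$$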



Thus, if index value $z$ satisfies the threshold condition $z'_1 \le z < z'_2$, then
index value $z$ belongs to table index cell 706 of data field 702 and scaled look-up
table 700 returns the computed value $a'_1$ in cell 707 of data field 704. In this
manner, turbo decoder 500, which operates in fixed point processing, uses scaled
look-up table 700 for metric calculations involving computing the function
$\log(e^{x_1} + e^{x_2})$. In accordance with the present invention, significant
reduction in the amount of computations is achieved by providing scaled look-up
table 700 which incorporates scaling factors for the noise variance, the quantizer
level, and the dynamic range control, circumventing the need to scale the entire
frame of received input data.
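
The construction and comparison-based addressing of a scaled look-up table can
be sketched in a few lines of Python. This is an illustrative sketch only: the
function names build_scaled_table and lookup are hypothetical, the caller
supplies its own values for the noise variance, Qx[0] and p, and floating-point
arithmetic stands in for the fixed point processing of the decoder.

```python
import math


def build_scaled_table(z_values, sigma2, q0, p):
    """Build the two data fields of a scaled table per equations (ix) and (x).

    z_values: unscaled index values z, in increasing order (assumed input)
    sigma2:   noise variance
    q0:       first quantizer level Qx[0]
    p:        dynamic range scale factor (a power of two in the text)
    """
    scale = p * sigma2 / q0
    indexes = [z * scale for z in z_values]                         # z'
    values = [math.log1p(math.exp(-z)) * scale for z in z_values]   # a'
    return indexes, values


def lookup(indexes, values, z):
    """Return the table value whose threshold range contains z.

    Implements: z < z'_1 -> a'_0, z'_1 <= z < z'_2 -> a'_1, ...,
    z >= z'_{N-1} -> a'_{N-1}, using sequential comparisons.
    """
    i = 0
    while i + 1 < len(indexes) and z >= indexes[i + 1]:
        i += 1
    return values[i]
```

Because the scaling is folded into both data fields once, the per-sample work
in this sketch is only the comparison loop; no multiplication by the noise
variance appears in lookup.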

Applications of the turbo decoder of the present invention can be found in
receivers where information is being transmitted over a noisy communication
channel. An exemplary application of the turbo decoder of the present invention is
illustrated in Figure 8. Wireless receiver 850 can be a wireless telephone, a pager
or other portable personal information device. Wireless receiver 850 receives
digital information transmitted over the communication channel via antenna 852.
The received data are demodulated by demodulator 853 and filtered and sampled
by Filter/Match/Sample unit 854. The received data are provided to turbo decoder
800 via bus 812. Turbo decoder 800 includes decoder 804 according to the present
invention which uses a scaled look-up table stored in a memory unit 805 for metric
calculations during the decoding process based on the Log-MAP decoding
algorithm. The computation circuits of decoder 804, represented as computation
unit 806, access the scaled look-up table by addressing memory unit 805. Turbo
decoder 800 provides the corrected received data on bus 828.

The turbo decoder of the present invention can be constructed as an
application specific integrated circuit (ASIC), as a field-programmable gate
array (FPGA), as digital signal processor (DSP) software, or using other suitable
means known by one skilled in the art. One of ordinary skill in the art, upon being
apprised of the present invention, would know that other implementations can be
used to practice the methods and apparatus of the present invention. The scaled
look-up table of the present invention can be generated by a processor external to
the turbo decoder and downloaded into the decoder during data processing of the
input data. The scaled look-up table can also be generated within the turbo
decoder with the decoder performing the necessary scaling functions.

Although the present invention has been described above with reference to
a specific application in a turbo decoder, the present invention is not intended to be
limited to applications in turbo decoders only. In fact, one of ordinary skill in the
art would understand that the method and apparatus of the present invention can be
applied to any systems performing metric computations of the function
$\log(e^{x_1} + \ldots + e^{x_n})$ in order to simplify the computation process and to
achieve the advantages, such as processing speed enhancement, described herein.
Furthermore, the present invention has been described as involving the
computation of $\log(e^{x_1} + \ldots + e^{x_n})$; however, one of ordinary skill in the
art would have understood that the same method and apparatus can be applied to the
computation of $\ln(e^{x_1} + \ldots + e^{x_n})$ and that the logarithm and natural
logarithm expressions of the above equations are interchangeable. The above detailed
descriptions are provided to illustrate specific embodiments of the present
invention and are not intended to be limiting. Numerous modifications and
variations within the scope of the present invention are possible.

Look-Up Table Addressing Scheme 

As described above, in turbo decoding using the Log-MAP algorithm, an
N-entry look-up table is used as a correction factor to approximate the operation
stated in equation (i) above. To address the look-up table, whether unscaled (table
100 of Figure 1) or scaled (tables 400 and 700 of Figures 4 and 7), the index value
$z$ (defined as $|x_1 - x_2|$) is compared with the table indexes (or table thresholds)
to determine in which threshold range $z$ belongs. For example, referring to table 100
of Figure 1, in the turbo decoding process, a given $z$ value is first compared with
$z_1$, then with $z_2$, $z_3$, and so on until the correct table threshold range is
identified. Note that typically $z_0$ is set to be zero as the values of $z$ are
non-negative. Since a given look-up table is addressed repeatedly in the turbo
decoding process, the comparison process can be time consuming, particularly for
look-up tables with many entries. Furthermore, the hardware implementation
involving comparators can be complex.

In accordance with another aspect of the present invention, a look-up table
addressing scheme is provided for improving the speed of the table look-up
operation and for significantly reducing the complexity of the hardware
implementation of the look-up table. The look-up table addressing scheme of the
present invention uses linearly spaced thresholds for the table indexes and allows
the look-up table to be addressed using address bits extracted from the index value
$z$. In this manner, the table look-up operation can be performed quickly since no
comparisons are needed in the table look-up operation. Furthermore, the look-up
table addressing scheme of the present invention can be applied to turbo decoding
using the Log-MAP algorithm as described above to achieve improvements in the
overall performance of the turbo decoding operation.

The look-up table addressing scheme of the present invention generates a
modified look-up table based on the original look-up table. Figure 9 illustrates a
modified 2N-entry precomputed look-up table according to one embodiment of the
present invention. In Figure 9, look-up table 900 is generated from an unscaled
look-up table (such as table 100 of Figure 1). This is illustrative only and the
look-up table addressing scheme can be applied to a scaled look-up table as well,
such as table 400 and table 700 described above, to achieve the same
performance improvement.

In the present embodiment, modified look-up table 900 is derived from
look-up table 100 of Figure 1 having table indexes $z_0, z_1, \ldots,$ and $z_{N-1}$ in
data field 102 and computed table values $a_0, a_1, \ldots,$ and $a_{N-1}$ in data
field 104. Of course, if a scaled look-up table is desired, then an N-entry scaled
look-up table such as table 400 or table 700 is first generated and modified look-up
table 900 can be derived from the scaled look-up table in the same manner the table
is derived from an unscaled look-up table. For the purpose of generating modified
table 900, the number of entries, N, of the original look-up table is assumed to be a
power of 2. If the number of entries N of the original look-up table is not a power
of 2, then the look-up table can be padded with additional entries having the same
value as the last entry to make the total number of entries a power of 2.
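
The padding step can be sketched as follows. This is a minimal illustration;
the function name pad_to_power_of_two is hypothetical, and the sketch pads only
the list of computed table values, leaving the treatment of the table indexes
to the implementer.

```python
def pad_to_power_of_two(values):
    """Repeat the last entry until the number of entries is a power of 2."""
    if not values:
        raise ValueError("the table must have at least one entry")
    size = 1
    while size < len(values):
        size *= 2
    return values + [values[-1]] * (size - len(values))
```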

Modified look-up table 900 has 2N entries, that is, it has twice the number
of entries as the original look-up table 100. Of course, table 900 can have other
numbers of entries, as long as the number of entries is a power of 2, such as 4N.
One of ordinary skill in the art would know how to apply the present invention to a
modified look-up table having the appropriate number of entries. In the original
look-up table 100, the table indexes $z$ are not necessarily evenly spaced. The
computed table values $a_0, a_1, \ldots,$ and $a_{N-1}$ are evaluated for each of the
respective table indexes $z_0, z_1, \ldots,$ and $z_{N-1}$.

In modified table 900, the 2N entries of the table indexes $\tilde{z}$, denoted as
$\tilde{z}_0, \tilde{z}_1, \tilde{z}_2, \ldots,$ and $\tilde{z}_{2N-1}$, are linearly
spaced from a value of $\tilde{z}_0$ (typically 0 or a non-negative value) to a
maximum value of $\tilde{z}_{2N-1}$. In the present embodiment, assuming that the
original look-up table is linearly spaced and has a threshold interval of $z_1$, the
table indexes $\tilde{z}$ of modified table 900 are separated by an interval defined
as $2^{\lfloor \log_2(z_1) \rfloor}$, where the notation $\lfloor x \rfloor$ represents
the largest integer value not greater than $x$. Thus, the table indexes $\tilde{z}$
are defined as follows:

$$\begin{aligned}
\tilde{z}_0 &= 0; \\
\tilde{z}_1 &= 1 \times 2^{\lfloor \log_2(z_1) \rfloor}; \\
\tilde{z}_2 &= 2 \times 2^{\lfloor \log_2(z_1) \rfloor}; \\
&\ \ \vdots
\end{aligned}$$

For example, if the table threshold interval $z_1$ of the original look-up table is
30, then $\log_2(30) \approx 4.9$ and the largest integer not greater than 4.9 is
$\lfloor 4.9 \rfloor = 4$. Then $\tilde{z}_1$ represents a threshold value of
$2^4 = 16$ with respect to the original table. Similarly, $\tilde{z}_2$ represents a
threshold value of 32 with respect to the original table. The table index values
$\tilde{z}$ are shown in data field 902 of modified table 900 in Figure 9.
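
The worked example above can be reproduced with a short Python sketch. The
function names are hypothetical, and the sketch assumes the original threshold
interval is at least 1 so that the logarithm is non-negative.

```python
import math


def modified_interval(z1):
    """Return 2 ** floor(log2(z1)), the spacing of the modified table indexes."""
    if z1 < 1:
        raise ValueError("this sketch assumes z1 >= 1")
    return 1 << math.floor(math.log2(z1))


def modified_indexes(z1, entries):
    """Return the first `entries` linearly spaced table indexes, starting at 0."""
    step = modified_interval(z1)
    return [i * step for i in range(entries)]
```

With an original threshold interval of 30, modified_interval(30) evaluates to 16
and the first three indexes are 0, 16 and 32, matching the example in the text.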

In other embodiments, the table threshold interval of the original look-up
table is not linearly spaced. In that case, the interval $z_1$ may be selected from any
of the table threshold interval values of the original look-up table. Typically, the
smallest table threshold interval of the original look-up table is chosen as the
interval $z_1$.

In the present embodiment, the computed table values
$\tilde{a}_0, \tilde{a}_1, \ldots,$ and $\tilde{a}_{2N-1}$ (data field 904) for each of
table indexes $\tilde{z}_0, \tilde{z}_1, \ldots,$ and $\tilde{z}_{2N-1}$ are derived
using linear interpolation based on the original computed table values
$a_0, a_1, \ldots,$ and $a_{N-1}$ and the original table indexes $z_0, z_1, \ldots,$ and
$z_{N-1}$. Of course, the computed table values $\tilde{a}_0, \tilde{a}_1, \ldots,$ and
$\tilde{a}_{2N-1}$ can also be generated by evaluating the function
$\log(1 + e^{-z})$ at each of the new table index values in data field 902. For the
purpose of turbo decoding using the Log-MAP algorithm, the computed table values
generated by linear interpolation are usually satisfactory.
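
One way to derive the modified table by linear interpolation is sketched below.
The function names are hypothetical, and the sketch makes two assumptions that
the text does not spell out: the original table indexes are strictly
increasing, and new indexes that fall beyond the last original index reuse the
last original table value.

```python
def interpolate(z_orig, a_orig, z):
    """Piecewise-linear interpolation of the original table, clamped at the ends."""
    if z <= z_orig[0]:
        return a_orig[0]
    if z >= z_orig[-1]:
        return a_orig[-1]
    for i in range(len(z_orig) - 1):
        z_lo, z_hi = z_orig[i], z_orig[i + 1]
        if z_lo <= z < z_hi:
            t = (z - z_lo) / (z_hi - z_lo)
            return a_orig[i] + t * (a_orig[i + 1] - a_orig[i])
    return a_orig[-1]  # not reached for strictly increasing z_orig


def build_modified_table(z_orig, a_orig, step, entries):
    """Return linearly spaced indexes and their interpolated table values."""
    z_new = [i * step for i in range(entries)]
    a_new = [interpolate(z_orig, a_orig, z) for z in z_new]
    return z_new, a_new
```

Evaluating the correction function directly at each new index is an equally
valid way to fill the value field; interpolation simply reuses an original
table that already exists.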

In the above descriptions, the modified look-up table 900 is derived from
an original look-up table. Of course, it is also possible to directly generate
modified look-up table 900 by specifying the range of table indexes
$\tilde{z}_0, \tilde{z}_1, \tilde{z}_2, \ldots,$ and $\tilde{z}_{2N-1}$ and computing
the corresponding computed table values $\tilde{a}_0, \tilde{a}_1, \ldots,$ and
$\tilde{a}_{2N-1}$ for each of the table indexes.

Modified look-up table 900 is addressed using table addresses which are
the sequential order of the table entries from 0 to 2N-1. In Figure 9, look-up table
900 includes a third data field 910 containing the table addresses ZAddr, where
each address value corresponds to one of the computed table values in data field
904. In accordance with the present invention, the computed table values
$\tilde{a}_0, \tilde{a}_1, \ldots,$ and $\tilde{a}_{2N-1}$ (data field 904) of table 900
are retrieved from table 900 for a given index value $z$ using table addresses ZAddr
as in a conventional memory.

Thus, to address modified look-up table 900, a table address ZAddr is
extracted from an index value $z$ which is computed from the argument values $x_1$
and $x_2$ according to the equation $z = |x_1 - x_2|$. The argument values $x_1$ and
$x_2$ are derived from the input data in the frame buffer of the turbo decoder. To
address table 900 having 2N entries, the number of address bits required, $m$, is
given as follows:

$$m = \log_2(2N). \qquad \text{(xi)}$$

Furthermore, for the purpose of indexing the look-up table, the first $n$ bits of the
table address ZAddr are dropped, where $n$ is given as:

$$n = \lfloor \log_2(z_1) \rfloor. \qquad \text{(xii)}$$

Accordingly, the value $n$ denotes the number of bits needed to represent the
interval of the modified table 900 as a binary number. Assuming that the index
value $z$ has $M$ bits, after the index value $z$ is computed based on argument values
$x_1$ and $x_2$, the $m$ data bits from bit $n$ to bit $n+m-1$ of index value $z$ are
used as the table address to directly access look-up table 900. The m-bit table
address in table index value $z$ is illustrated as follows:

$$M-1,\ M-2,\ \ldots,\ n+m,\ \overbrace{n+m-1,\ n+m-2,\ \ldots,\ n+1,\ n}^{\text{ZAddr: table address}},\ n-1,\ n-2,\ \ldots,\ 1,\ 0$$



Bits 0 to $n-1$ of table index value $z$ represent values within one table threshold
interval value $z_1$, where bit $n-1$ is the most significant bit representing the table
threshold interval value. Thus, the $m$ bits of index value $z$ more significant than
bit $n-1$ are used as address bits. By using the m-bit table address to directly
address modified look-up table 900 as in a memory, no comparison is needed and the
table look-up operation can be performed with a much greater efficiency.
Furthermore, the complexity of the hardware implementation of the turbo decoder is
also reduced since comparators are no longer needed for the table look-up
operations. Note that if a non-zero bit is detected in bits
$M-1, M-2, \ldots, n+m$ in table index value $z$, then the last table entry (or the
table entry with the largest table index value) is used regardless of the value of
the address bits $n+m-1, \ldots, n+1, n$.
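
The comparison-free addressing described above amounts to a shift and a mask.
The sketch below is illustrative only: the function name table_address is
hypothetical, z is assumed to be a non-negative integer, and n and m are the
quantities defined in equations (xii) and (xi).

```python
def table_address(z, n, m):
    """Return the m-bit table address taken from bits n .. n+m-1 of z.

    If any bit above bit n+m-1 is set, z lies beyond the table range and
    the address of the last table entry is returned instead.
    """
    if z < 0:
        raise ValueError("index value z must be non-negative")
    last_entry = (1 << m) - 1
    if z >> (n + m):            # a non-zero bit among bits M-1 .. n+m
        return last_entry
    return (z >> n) & last_entry


# Example: a table with 2N = 8 entries (m = 3) and interval 16 (n = 4).
for z in (0, 15, 16, 47, 127, 128):
    print(z, table_address(z, 4, 3))
```

In hardware the same operation is only a selection of wires plus an OR over
the upper bits, which is why no comparators are required.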

In the above description, the look-up table addressing scheme is described
with reference to a specific application in turbo decoding. However, this is
illustrative only and a person of ordinary skill in the art would appreciate that the
look-up table addressing scheme of the present invention can be applied to any
mathematical computations involving the use of a look-up table for simplifying the
computation and enhancing the speed of the computation. The look-up table
addressing scheme of the present invention can be implemented in software or
other means known in the art. The above detailed descriptions are provided to
illustrate specific embodiments of the present invention and are not intended to
be limiting. Numerous modifications and variations within the scope of the present
invention are possible.

Look-up Table Index Value Generation in a Turbo Decoder 

In the turbo decoding process based on the Log-MAP algorithm described
above, an N-entry look-up table is used as a correction function to the
maximization of the argument values $x_1$ and $x_2$ as shown in equation (ii) or (iv).
During the turbo decoding process, the look-up table, whether a scaled version or
an unscaled version, is accessed continuously to compute the probability
calculations, including the backward, forward and extrinsic probabilities. For each
probability calculation, the decoding process first computes an index value $z$ based
on the argument values $x_1$ and $x_2$ for which the correction value in the look-up
table is desired. In the present description, index value $z$ is defined to be:
$z = |x_1 - x_2|$. Then, the index value $z$ is used to address or access the look-up
table for retrieving a correction value from the data field containing the computed
table values. Of course, as described above, the look-up table can be addressed in
one of several ways. The index value $z$ can be compared with each of the table index
values until the correct table threshold range is found. On the other hand,
according to the look-up table addressing scheme of the present invention, a
modified look-up table can be generated and a portion of the bits in the index value
$z$ can be used as address bits to address the modified look-up table.

To compute index value $z$, the decoding process computes the difference
between the argument values $x_1$ and $x_2$ and then takes the absolute value of the
difference to generate the index value $z$. Typically, argument values $x_1$ and $x_2$
are represented as signed numbers expressed in 2's complement format. Figure 10 is a
block diagram illustrating one exemplary implementation of an index value
generation circuit for computing the index value $z = |x_1 - x_2|$. First, to compute
the difference of $x_1$ and $x_2$, circuit 1000 takes the 2's complement of argument
value $x_2$ to obtain its negative value. This is done by inverting each bit of
argument value $x_2$ using inverter 1002 and then adding a value of 1 (on line 1003)
to the inverted value. Then, the negative value of argument value $x_2$ is added to
argument value $x_1$ (on line 1004). The two summation steps are performed by an
adder 1006 to provide the difference value $x_1 - x_2$ in M bits on output line 1008.
The computation of index value $z$ then proceeds with taking the absolute value of
the difference $x_1 - x_2$ (on line 1008). The most significant bit (MSB) of the
difference $x_1 - x_2$ (on line 1009) is provided to a multiplexer 1016. Since the
difference $x_1 - x_2$ is expressed in 2's complement, the MSB is a sign bit
indicating whether the difference is a positive value (MSB=0) or negative value
(MSB=1). If the difference is a positive value, then multiplexer 1016 selects the
difference value on line 1008 as the absolute value of the difference. The index
value $z = |x_1 - x_2|$ having M bits is provided on output bus 1018. If the
difference is a negative value, then taking the absolute value involves reversing
the sign of the difference value. This is done by taking the 2's complement of the
difference $x_1 - x_2$. Thus, a second inverter 1010 and a second adder 1012 are
provided to invert the bits of difference value $x_1 - x_2$ and then add a value of 1
(on line 1011) to the inverted value. The output of adder 1012 is the absolute value
of the difference $x_1 - x_2$. Multiplexer 1016 selects the 2's complement value of
the difference $x_1 - x_2$ computed on bus 1014 when the difference is a negative
number.
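
The data path of Figure 10 can be modeled bit for bit in software. The sketch
below is a behavioral model only, not a description of the actual circuit: the
function name and the bits parameter (the word width M) are hypothetical, and
the difference is assumed to fit in M bits without overflow.

```python
def abs_diff_two_adders(x1, x2, bits):
    """Model of the two-adder computation of |x1 - x2| in M-bit 2's complement."""
    mask = (1 << bits) - 1
    # First inverter and adder: x1 + (~x2) + 1 gives x1 - x2 modulo 2**bits.
    diff = ((x1 & mask) + ((~x2) & mask) + 1) & mask
    sign = diff >> (bits - 1)          # MSB: 0 = positive, 1 = negative
    # Second inverter and adder: 2's complement of the difference.
    negated = (((~diff) & mask) + 1) & mask
    # Multiplexer: select the negated value only when the sign bit is set.
    return negated if sign else diff
```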

The straightforward implementation shown in Figure 10 for computing
index value $z = |x_1 - x_2|$ has several shortcomings. First, circuit 1000 requires
two M-bit full adders 1006 and 1012. Because adders typically consume a large circuit
area, the two-adder implementation of Figure 10 is not space efficient and, when
implemented in an integrated circuit, consumes a large amount of silicon area, thus
increasing the manufacturing cost. Second, circuit 1000 in Figure 10 has an
undesirably slow speed of operation because of a long critical path. Specifically,
the critical path includes input argument value $x_2$ provided to inverter 1002, adder
1006, inverter 1010, adder 1012 and finally multiplexer 1016. Because the turbo
decoding process requires generating index value $z$ repeatedly, it is desirable that
index value $z$ be computed quickly and that the index value generation circuit be
space efficient.

According to another aspect of the present invention, an implementation for
computing the index value $z = |x_1 - x_2|$ in a turbo decoder is provided. Figure 11
is a block diagram illustrating an index value generation circuit for computing the
index value $z = |x_1 - x_2|$ according to one embodiment of the present invention. In
circuit 1100 of Figure 11, the difference $x_1 - x_2$ is computed in the same manner as
in circuit 1000 of Figure 10. Basically, inverter 1102 and adder 1106 are used to
take the 2's complement of argument value $x_2$ and then add the negative value of
$x_2$ to argument value $x_1$. An M-bit value of the difference $x_1 - x_2$ is provided
on line 1108 to a multiplexer 1116. The operation of multiplexer 1116 is analogous to
circuit 1000 of Figure 10. In Figure 11, when the difference $x_1 - x_2$ is a negative
number, the absolute value of the difference $x_1 - x_2$ is computed by taking the 1's
complement. That is, the difference $x_1 - x_2$ is inverted by an inverter 1110 and the
inverted value is taken as the absolute value of the difference $x_1 - x_2$. The
implementation in Figure 11 saves valuable circuit space by eliminating the need
for a second adder such as adder 1012 in Figure 10. The speed of operation is also
improved by eliminating a second adder in the critical path.

In effect, the implementation in Figure 11 eliminates the second adder by
omitting the "addition of 1" operation required in taking the 2's complement to
obtain the absolute value of the difference $x_1 - x_2$ when the difference is a
negative value. Therefore, when the difference $x_1 - x_2$ is a negative number, the
output value $|x_1 - x_2|$ on bus 1118 will be off by the value of 1. However, this
discrepancy in the output value $|x_1 - x_2|$ is insignificant in the turbo decoding
process and in most cases, the discrepancy does not affect the accuracy of the
probability calculations at all. It is important to note that the index value $z$ is
used to address an N-entry look-up table including N entries of table index values or
table threshold values, such as $z_0, z_1, \ldots,$ and $z_{N-1}$ of table 100 of
Figure 1. Thus, even if the index value $z$ is off by 1, in most cases, the table
look-up operation will still return the same computed table values because the index
value $z$, whether off by 1 or not, will still fall within the same threshold range in
the look-up table. The only time when the off-by-1 discrepancy will cause a different
computed table value to be returned is when the index value $z$ falls on the boundary
of a threshold value such that the off-by-1 index value $z$ will return a different
computed table value than the precise index value $z$. Because this situation occurs
infrequently, the approximation made in Figure 11 gives negligible performance
degradation while providing significant improvement in silicon area consumption and
in speed. In one embodiment, the silicon area required to implement the circuit in
Figure 11 is 40% less than the area required to implement the circuit in Figure 10.
Moreover, the critical path in the circuit of Figure 11 is shortened by 40% as
compared to the critical path in the circuit of Figure 10.

These advantages of the table index value generation circuit described herein have
not been appreciated by others prior to the present invention.
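
The effect of omitting the "addition of 1" can be checked with a small
behavioral model. This sketch is illustrative only: the function names and the
word width are hypothetical, the difference is assumed to fit in the word
width, and the threshold spacing of 16 used below is an arbitrary example of a
linearly spaced table, not a value from the specification.

```python
def abs_diff_one_adder(x1, x2, bits):
    """Model of the single-adder circuit: 1's complement for negative results."""
    mask = (1 << bits) - 1
    diff = ((x1 & mask) + ((~x2) & mask) + 1) & mask   # x1 - x2, 2's complement
    if diff >> (bits - 1):                             # negative difference
        return (~diff) & mask                          # inversion only, no +1
    return diff


def count_range_changes(bits, limit, n):
    """Count input pairs whose threshold range (z >> n) changes due to the error."""
    changed = total = 0
    for x1 in range(-limit, limit):
        for x2 in range(-limit, limit):
            total += 1
            if (abs_diff_one_adder(x1, x2, bits) >> n) != (abs(x1 - x2) >> n):
                changed += 1
    return changed, total
```

In this model the result equals the exact absolute difference when x1 is not
less than x2, and is exactly one less otherwise. The threshold range changes
only when the exact value sits on a multiple of the spacing, which is the
boundary case described in the text.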

A Stop Iteration Criterion for Turbo Decoding 

As described above, turbo decoding is an iterative process. For example, in
the two-stage iterative decoding process illustrated in Figure 3, decoder 332
computes a posteriori information P1 (provided on bus 322) which is interleaved
and provided to decoder 334. Decoder 334 in turn computes a posteriori
information P2 (provided on bus 323) which is deinterleaved and provided back to
decoder 332. Typically, the decoding process repeats for a sufficient number of
iterations to ensure that the bit decisions converge. The resulting bit decisions for
the input data are then provided by decoder 334 on bus 228. However, because the
number of iterations needed differs depending on the signal-to-noise ratio (SNR) of
the received input data and the frame size of the data, the number of iterations
chosen is often either too many or too few, resulting in either inefficiency in the
decoding process or inaccurate bit decisions.

Ideally, a stop iteration criterion based on monitoring the convergence of
the likelihood function can be used. Thus, at each iteration of the turbo decoding
process, the decoder monitors the a posteriori probability values computed by each
constituent decoder for each bit in the input data. When the probability values
converge, the iteration is stopped and the bit decisions are outputted by the turbo
decoder. However, using the convergence of the likelihood function as a stop
iteration criterion is inefficient because it requires a significant amount of
processing.

According to another aspect of the present invention, a stop iteration
criterion for turbo decoding is provided where the turbo decoder monitors the bit
decisions from each constituent decoder for each data bit at each iteration and
ceases further iterations when the bit decisions converge. When turbo decoder 200
of Figures 2 and 3 of the present invention incorporates the stop iteration criterion
of the present invention, significant improvement in the decoding performance can
be observed. For instance, by stopping the iteration early, the turbo decoder
portion of the circuit can be shut down, thus conserving power. Figure 12 is a
block diagram of a turbo decoder incorporating the stop iteration criterion of the
present invention in its decoding operation according to one embodiment of the
present invention. Figure 13 is a block diagram illustrating a complete iteration of
the decoding operation of the turbo decoder of Figure 12. Turbo decoder 1200 of
Figures 12 and 13 is constructed in the same manner as turbo decoder 200 of
Figures 2 and 3. Like elements in Figures 2, 3, 12 and 13 are given like reference
numerals and will not be further described.

Referring to Figure 12, turbo decoder 1200 includes a decoder 204 which
outputs bit decisions on output bus 228. Turbo decoder 1200 further includes a
buffer 1250, a deinterleaver 1252 and a comparator 1254. As explained above, in
the actual implementation of turbo decoder 1200, only one elementary decoder
(decoder 204) is needed to perform the decoding operations. Thus, decoder 204 is
used repeatedly for decoding the constituent codes for each stage of the decoding
process. In turbo decoder 1200, buffer 1250 is provided for storing the bit
decisions computed for each decoding stage during an iteration of the decoding
process so that the bit decisions can be compared at the completion of the iteration,
as will be described in more detail below. In the present description, one iteration
is defined as the processing starting with the first decoding stage through the last
decoding stage, each decoding stage operating on its own constituent code.
Deinterleaver 1252 is provided to deinterleave bit decisions from decoding stages






M-7976 US 
777791 vl 

operating on interleaved systematic information. Finally, comparator 1254
monitors and compares the bit decisions generated at each iteration to determine if
further iteration is required.

Referring to Figure 13, decoder 332 and decoder 334 each operate on their
own constituent codes and compute tentative bit decisions based on the respective
constituent codes. In accordance with the present invention, after the processing of
the constituent codes in each iteration, the turbo decoder proceeds to compare the
tentative bit decisions computed by each of the constituent decoders to determine
whether the bit decisions from each of the decoders are the same. When the
tentative decisions from each of the constituent decoders within the same iteration
are the same, the turbo decoder stops further iterations and the bit decisions are
provided as the final bit decisions. In turbo decoder 1200 including two
constituent decoders, tentative bit decisions from decoder 332 and tentative bit
decisions from decoder 334 are compared at each iteration by comparator 1254.
The bit decisions from decoder 334 have to be deinterleaved by deinterleaver 1252
before being compared with the bit decisions from decoder 332. If decoders 332
and 334 are used recursively to decode other constituent codes in the decoding
process, then the bit decisions are first stored in buffer 1250 and the bit decisions
are not compared until all bit decisions in an iteration of the decoding process have
been completed.

For instance, if the bit decisions from decoders 332 and 334 are the same,
then comparator 1254 outputs a command on bus 1256 (Figure 12) to instruct turbo
decoder 1200 to stop the decoding iterations. Bit decisions on either decoder 332
or 334 are outputted as the final decoding result. If the bit decisions are not the
same, turbo decoder 1200 continues with the next iteration of the decoding
process. The stop iteration criterion of the present invention provides
improvement in decoding performance without compromising decoding accuracy.
In turbo decoding, since bit decisions are based on the likelihood function, the
convergence in bit decisions implies that the likelihood function has converged
sufficiently such that the bit decisions are not affected from one decoder to the next
decoder in the same iteration. Therefore, when the bit decisions converge in a
given iteration, any further iteration will not improve the accuracy of the bit
decisions and thus, the iteration can be stopped.
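
The control flow of the stop iteration criterion for two constituent decoders
can be sketched as below. This is a hedged illustration: run_iteration is a
hypothetical callback standing in for one full decoding iteration, and it is
assumed to return the tentative bit decisions of both constituent decoders in
the same (deinterleaved) bit order. At least one iteration is assumed.

```python
def decode_with_early_stop(run_iteration, max_iterations):
    """Iterate until both constituent decoders agree on every bit decision.

    run_iteration(k) -> (bits_1, bits_2): tentative decisions of the two
    constituent decoders in iteration k (hypothetical interface).
    Returns the final bit decisions and the number of iterations used.
    """
    if max_iterations < 1:
        raise ValueError("at least one iteration is required")
    for k in range(1, max_iterations + 1):
        bits_1, bits_2 = run_iteration(k)
        if bits_1 == bits_2:        # the comparator finds no disagreement
            return bits_2, k
    return bits_2, max_iterations   # iteration budget exhausted
```

The comparison costs one equality check per data bit per iteration, which is
small next to the metric calculations it can save when agreement is reached
early.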

The stop iteration criterion of the present invention can be applied to
parallel concatenated turbo decoders with an arbitrary number of constituent codes
and frame sizes. For a turbo decoder consisting of N constituent decoders, the
tentative bit decision in the $k^{th}$ iteration for the $n^{th}$ decoder at a time
index $m$ is denoted as $d(m, k, n)$. Here, the time index $m$ is used to identify a
data bit in the systematic input data $s(m)$ which is normalized by the SNR. In the
following equations, a numeral subscript on $m$ (such as $m_1$ and $m_2$) denotes the
same data bit being associated with the respective constituent decoder (such as
decoder 1 and decoder 2). With respect to the N constituent decoders, the tentative
bit decisions are given as follows:

Decoder 1:

$$d(m, k, 1) = \mathrm{sgn}\Big( 2s(m) + p(m_1, k, 1) + \sum_{n=2}^{N} p(m_n, k-1, n) \Big);$$

Decoder 2:

$$d(m, k, 2) = \mathrm{sgn}\Big( 2s(m) + \sum_{n=1}^{2} p(m_n, k, n) + \sum_{n=3}^{N} p(m_n, k-1, n) \Big);$$

Decoder 3:

$$d(m, k, 3) = \mathrm{sgn}\Big( 2s(m) + \sum_{n=1}^{3} p(m_n, k, n) + \sum_{n=4}^{N} p(m_n, k-1, n) \Big); \text{ and}$$

Decoder N:

$$d(m, k, N) = \mathrm{sgn}\Big( 2s(m) + \sum_{n=1}^{N} p(m_n, k, n) \Big),$$

where the function $\mathrm{sgn}(\cdot)$ is the sign function and the function
$p(\cdot)$ represents the a posteriori probability calculated by the respective
decoder. When the parameter operated on by $\mathrm{sgn}(\cdot)$ is a negative number,
$\mathrm{sgn}(\cdot)$ returns a value of "1". When the parameter operated on by
$\mathrm{sgn}(\cdot)$ is a positive number, $\mathrm{sgn}(\cdot)$ returns a value of
"0". According to the stop iteration criterion of the present invention, if
$d(m, k, 1) = d(m, k, 2) = d(m, k, 3) = \cdots = d(m, k, N)$ for
$m = 1, 2, \ldots, M$, where $M$ is the frame size of the input data, then the turbo
decoder stops the decoding iteration and outputs the bit decisions.

For turbo decoder 1200 consisting of two constituent decoders, the tentative
bit decisions on the $k^{th}$ iteration for the two constituent decoders are given by:

$$d(m, k, 1) = \mathrm{sgn}\big( 2s(m) + p(m_1, k, 1) + p(m_2, k-1, 2) \big), \text{ and}$$
$$d(m, k, 2) = \mathrm{sgn}\big( 2s(m) + p(m_1, k, 1) + p(m_2, k, 2) \big).$$

If $d(m, k, 1) = d(m, k, 2)$ for $m = 1, 2, \ldots, M$, turbo decoder 1200 can stop the
iteration and output the bit decisions. The stop iteration criterion of the present
invention can be incorporated in any turbo decoders, including turbo decoders
applying the MAP, Max-Log-MAP or Log-MAP decoding algorithm, to improve
the decoding performance.
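
The tentative bit decision formulas can be written out in Python as a check on
the index ranges. The sketch is illustrative and rests on stated assumptions:
the names are hypothetical, the a posteriori values are assumed to be already
aligned so that position m refers to the same data bit for every decoder (the
role of the subscripts on m), and an argument of exactly zero, which the text
does not cover, is mapped to 0.

```python
def sgn_bit(value):
    """Sign function of the text: 1 for a negative argument, 0 otherwise."""
    return 1 if value < 0 else 0


def tentative_decisions(s, p_curr, p_prev, n):
    """Tentative bit decisions d(m, k, n) of decoder n (1-based) for all m.

    s:       systematic input data s(m), one value per data bit
    p_curr:  p_curr[j][m] is the a posteriori value of decoder j+1, iteration k
    p_prev:  p_prev[j][m] is the same quantity from iteration k-1
    """
    decoders = len(p_curr)
    decisions = []
    for m in range(len(s)):
        total = 2 * s[m]
        total += sum(p_curr[j][m] for j in range(n))             # decoders 1..n
        total += sum(p_prev[j][m] for j in range(n, decoders))   # decoders n+1..N
        decisions.append(sgn_bit(total))
    return decisions


def decisions_converged(s, p_curr, p_prev):
    """True when all constituent decoders agree on every bit of the frame."""
    first = tentative_decisions(s, p_curr, p_prev, 1)
    return all(
        tentative_decisions(s, p_curr, p_prev, n) == first
        for n in range(2, len(p_curr) + 1)
    )
```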

The above detailed descriptions are provided to illustrate specific
embodiments of the present invention and are not intended to be limiting.
Numerous modifications and variations within the scope of the present invention
are possible. The present invention is defined by the appended claims.